Generalization of generative model for neuronal ensemble inference method

Various brain functions that are necessary to maintain life activities materialize through the interaction of countless neurons. Therefore, it is important to analyze functional neuronal network. To elucidate the mechanism of brain function, many studies are being actively conducted on functional neuronal ensemble and hub, including all areas of neuroscience. In addition, recent study suggests that the existence of functional neuronal ensembles and hubs contributes to the efficiency of information processing. For these reasons, there is a demand for methods to infer functional neuronal ensembles from neuronal activity data, and methods based on Bayesian inference have been proposed. However, there is a problem in modeling the activity in Bayesian inference. The features of each neuron’s activity have non-stationarity depending on physiological experimental conditions. As a result, the assumption of stationarity in Bayesian inference model impedes inference, which leads to destabilization of inference results and degradation of inference accuracy. In this study, we extend the range of the variable for expressing the neuronal state, and generalize the likelihood of the model for extended variables. By comparing with the previous study, our model can express the neuronal state in larger space. This generalization without restriction of the binary input enables us to perform soft clustering and apply the method to non-stationary neuroactivity data. In addition, for the effectiveness of the method, we apply the developed method to multiple synthetic fluorescence data generated from the electrical potential data in leaky integrated-and-fire model.


Introduction
In many organisms, various functions necessary to maintain life activities such as perception, movement, and emotion are realized through the interaction of countless neurons in multiple regions of brain. Each neuron transmits information by chemical and electrical signals through synaptic and gap junctions. Therefore, from the perspective of network science, analysis of neuronal network is important for elucidating the mechanisms of brain functions. As such, in many fields of neuroscience, various studies are being conducted to elucidate the structure and function of neuronal networks from experimental and theoretical viewpoints. In fact, in the field of neurophysiology, measurement techniques have been developed for the purpose of simultaneously visualizing the activity of neurons in multiple brain regions. One of the main methods for measuring neuronal activity at the level of a single neuron is calcium imaging, which can measure the activity of a large number of neurons on the 10 4 order at a time [1,2]. In our study, we define functional connections as virtual connections, which are determined based on the synchronization of activities among neurons, as opposed to structural connections representing actual synaptic or gap junction. Of course, structural connection is significant for clarifying structural neuronal network in the brain. However, it is difficult to infer structural connection from activity data because unobserved neurons may influence the activity of observed neurons in measurement experiment through hidden structural connections. For this reason, the study on the inference of functional connectivity from activity data has mainly and intensively been conducted. For example, the functional connection strength is evaluated as time-averaged quantity in the methods in the previous studies [3][4][5].
In addition to the functional connection, functional neuronal ensembles are defined as groups of neurons with virtual functional connections determined by the synchronization of their activities. It should be emphasized that the synchronous activities of neurons do not necessarily depend on the presence or absence of direct structural coupling, but may also on indirect structural coupling through multiple neurons. Such functional neuronal ensembles play a specific role in the information processing in the brain, and various methods for inferring functional neuronal ensembles have been proposed [6][7][8]. It can also be used for limiting the data to be analyzed as preprocessing of local functional network inference. As another significant topic, it has been suggested that hub neurons, which have functional connections with many neurons and influence them in neuronal network, are involved in the efficiency of the information processing between neurons [9]. Thus, inference of functional neuronal ensembles centered on hub neurons is important not only as a preprocessing for local functional network inference but also for elucidating the global mechanism of brain functions.
Among the methods of functional neuronal ensemble, Bayesian inference model and Markov chain Monte Carlo (MCMC) method are used in the previous study [10], whose improvement for faster convergence to inference result has also been proposed [11]. The method based on Bayesian inference needs a generative model, where only binary expression of neuronal activity data is allowed in the previous studies. For this reason, we call the method in the previous studies [10,11] as binary inference method (BIM) in the following. Apparently, BIM cannot be applied to activity data of continuous values such as fluorescence intensity by calcium imaging. Of course, it is possible to apply BIM to continuous-valued data after binarization. However, it is desirable to analyze continuous-valued data directly, because part of the original information in raw activity data may be lost by binarization. Another problem in BIM is the assumption of stationarity in the activity, which means that functional neuronal ensemble is assumed to be unchanged throughout the measurement experiment. However, functional neuronal ensembles will change depending on the brain state and experimental task. This is because actual neurons often have multiple functional roles depending on brain states such as sleep and wakefulness. Furthermore, responses to stimuli are different due to various brain states. Therefore, when BIM is applied to highly non-stationary activity data with frequent changes of brain states [12,13], the inference result by MCMC may be unstable due to the dependence on initial condition in MCMC, and accuracy of the inference may worsen. Additionally, such disadvantages are also due to hard clustering in BIM, where each neuron must belong to a particular ensemble at any time of experiment. Because functional ensembles change over time in non-stationary data, hard clustering method is not suitable.
For these reasons, we develop a generalized method for functional ensemble inference in this study, which does not depend on the format of the neuronal activity data and can be implemented as a soft clustering method, where each neuron may belong to multiple ensembles. In our generative model, the variable representing the input neuronal activity is continuous, and accordingly the limitation of the binary data format is removed. In addition, by expressing the neuron assignment to ensemble as continuous-valued weight vector rather than categorical variable, we can change the hard clustering in BIM to soft one. This can also be regarded as a generalization of the model because soft clustering includes hard one. The generalizations to continuous input activity and soft clustering allow us to widen the range of variables representing neuronal activity in the model. In other words, we can enhance the expressivity of our generative model for neuronal activity, where expressivity means the range of the variables in the inference model in mathematical sense. Furthermore, we also develop an algorithm using MCMC for our generalized model.
For validity, we apply our proposed method to synthetic fluorescence intensity data. For synthetic data generation, we first generate time series data of electrical membrane potential in neurons under external input current by leaky integrated-and-fire model [14]. Next, membrane potential is converted to fluorescence intensity by considering the measurement process of experimental calcium imaging. After fluorescence intensity data is obtained, we apply our proposed method to infer functional neuronal ensembles. Finally, we discuss the results in several cases of external input currents by tuning the parameter of soft/hard clustering, and compare the results by our proposed method and BIM.

Bayesian inference model
The notation of variables in this article is based on the previous work [10]. The differences in the notation from the previous study due to the generalization of the model will be summarized later. Boldface is used to denote a vector or matrix variable unless otherwise specified. Additionally, subscript is used to denote each element in a vector or matrix variable.
In our model, let N be the number of neurons, M be the number of time steps in activity measurement, and A be the number of neuronal ensembles. The generative model in our proposed Bayesian inference method is expressed using three variables: s is neuronal activity; t is the weight of a neuron belonging to a neuronal ensemble; and ω is time-series activity state of a neuronal ensemble. The variable s representing neuronal activity is given as input. Accordingly, the number of neurons N and the number of time steps M are determined from the input. In BIM and our proposed method, the number of neuronal ensembles is changed dynamically. Hence, inferences of ensemble structure and the number of ensembles are performed simultaneously from given activity data. Therefore, the number of neuronal ensembles A is an estimator, which is a natural number between 1 and N for hard clustering in BIM. However, due to the extension of our proposed method to soft clustering, A may be larger than the number of neurons N, and the possible value of A is in the whole natural number. In practice, the number of ensembles is expected to be sufficiently smaller than N if an appropriate ensemble structure is estimated, therefore our proposed method provides an upper bound on A as a parameter. The estimation of A will be described again in the description of the algorithm. Next, the subscripts i, k, and μ used in elementwise notation of each variable represent the label of the neuron, the label of the time step, and the label of the neuronal ensemble, respectively. Therefore, each variable represents the following.
• s ik : neuronal activity of the ith neuron at the kth time step • t μi : membership weight of the ith neuron to the μth ensemble • ω kμ : activity of the μth ensemble at the kth time step The ranges of variables s and ω are limited as 0 � s ik � 1 8i, k and 0 � ω kμ � 1 8k, μ, respectively, to normalize the activity weights of each neuron and ensemble. The variable t i = {t 1i , � � �, t Ai } is one-hot representation or t μi 2 {0, 1} in the case of hard clustering, while t μi ranges from 0 to 1 and satisfies P A m¼1 t mi ¼ 1 8i in the case of soft clustering. In comparison with BIM, the input neural activity s and ensemble activity ω are changed from 0-or-1 binary variables to continuous ones in the range [0, 1], and the weight t of neurons is changed from categorical variables to continuous-valued vector variables. The range [0, 1] for s and ω is designed to normalize the amount of information in the activity of each neuron or ensemble. In addition, the number of ensembles initially given in MCMC is used as the upper bound of A, which is denoted by A init in Table 1. The estimation of the appropriate A and ensemble structure under given upper bound are common both in BIM and our proposed model. The detail of the variation in the number of ensembles will be explained later in the algorithm part. The specific changes in variables are summarized in Table 1.
In addition to these variables, three parameters n, p, and λ are introduced to each ensemble in the generative model, whose meanings are described in the following. For each parameter, Dirichlet or beta distribution is assumed as prior distribution to facilitate marginalization in Bayesian inference.
• n: affinity parameter to a neuronal ensemble, following Dirichlet distribution • p: activity parameter of a neuronal ensemble, following beta distribution • λ: synchronization parameter between the activities of a neuronal ensemble and neurons within the ensemble, following beta distribution More precisely, two kinds of λ's are introduced for synchronization; λ 1 or λ 0 is the synchronization when both of ensemble and neuron are active or inactive, respectively. The distribution of parameters in this model is expressed as follows.
Here a ðnÞ m is the hyperparameter of the prior Dirichlet distribution on neuron assignment, which represents the frequency of neuron assignment to each ensemble. a ðpÞ m and b ðpÞ m are the hyperparameters of the prior beta distribution, which represent the frequency of ensemble activity and inactivity, respectively. a ðlÞ 1;m and b ðlÞ 1;m are the hyperparameters of the prior beta distribution, which represent the frequency of active and inactive neurons when the belonging ensemble is active, respectively. a ðlÞ 0;m and b ðlÞ 0;m are the hyperparameters of the prior beta distribution, which represent the frequency of active and inactive neurons when the belonging ensemble is inactive, respectively.
The likelihood is designed as follows to facilitate marginalization like the design of prior distribution.
Pðt; ω; sjn; p; lÞ / The meanings of these quantities are the same as the hyperparameters, which have already been defined without tildes. Note that they are defined by the variables t, ω, and s. For inference, we need the joint distribution of t, ω, and s under given hyperparameters. In BIM, the above likelihood was expressed in the form of Kronecker delta, since t, ω, and s take discrete values. The correspondence between the variables in BIM and ours is shown in Table 2. As can be seen from the table, the likelihood in our model is the generalization to continuous variables, which also includes binary and categorical cases. By product rule in Bayesian statistics and marginalization of the parameters n, p, λ, the following expression is obtained, where B(x 1 , x 2 ) is beta function and Bðx 1 ; . . . x A Þ is multivariate beta function represented as with Γ(x) being gamma function. For inference of the structure of the functional neuronal ensemble and activity of the ensemble, namely inference of t and ω, the marginalized expression (13) is used for improvement of inference accuracy and reduction of computational cost.
In more detail, Dirichlet and beta distributions are chosen as the conjugate prior both in BIM and in our model so that the forms of the prior and posterior distributions are the same. In addition, the form of prior distribution is designed to remain unchanged after update of the hyperparameters. Such design enables us to integrate out the parameters n, p, and λ analytically as in Eq (13), which yields probability model expressed only by hyperparameters. Consequently, computational cost and estimation error can be reduced, because the parameters n, p and λ can be eliminated by analytical integration, and accordingly the numerical integration of these parameters is not necessary for the inference.
To summarize, we compare the two models: BIM and ours. In our model, the variable range of neuronal state is extended by the continuous s and ω, which leads to more detailed representation of the similarity between activities of neurons and ensemble. Membership weight t is generalized to continuous-valued vector variable for expressing the certainty of membership assignment to target ensemble. By changing t to continuous-valued, our generative model allows soft clustering for expressing overlap in membership assignment, and accordingly multiple roles of neurons can also be expressed. Moreover, the likelihood in BIM is generalized to improve the expressivity of the model. Consequently, our model is very effective not only for stationary activity data with the fixed functional role of neuron but also for the data of non-stationary activity, the data with multiple functional roles for each neuron, or the data under the experiment of low reproducibility. Additionally, MCMC algorithm in BIM is also changed for estimation of continuous variables. The changes in the algorithm will be described in the following.

MCMC and Dirichlet process
In our method, the structure of the neuronal ensemble and activity of ensemble are inferred from the solution of posterior maximization by applying MCMC method to Eq (13) similarly to BIM [10]. In addition, the number of neuronal ensembles is dynamically changed by Dirichlet process in MCMC. Dirichlet process is a stochastic process that can generate arbitrary discrete distributions, and the probability of each event is determined by concentration parameter [15]. For speeding up the inference by MCMC, parallel computation using synchronous update can be implemented.
The variables to be inferred are t and ω in our model, and the variable updates for t and ω are performed alternately until convergence in MCMC. The procedure of variable update is summarized as follows.
1. Ensemble activity ω is updated according to Dirichlet process of order two (or beta process) and Metropolis-Hastings method in MCMC.

Value of membership weight t is updated according to Dirichlet process and Metropolis-Hastings method in MCMC.
3. Hyperparameters are updated using updated ω and t by the processes 1 and 2.
The update rule of membership weight t i = {t 1i , . . ., t Ai } is explained in the following, which differs from BIM due to the change of the generative model. Other variables are updated in the same manner as in BIM.
In our model, the element of membership weight t μi is a continuous variable satisfying normalization condition P A m¼1 t mi ¼ 1. Hence, for update of membership weight t i , all elements in t i must be updated unlike Dirichlet process for one-hot representation of t i in BIM. Let t 0 i be the original membership weight of neuron i before update and t * i be the proposed membership weight for update, which may or may not be accepted by Metropolis-Hastings method. Here we define concentration ensemble G * i for neuron i in each iteration of MCMC. The values in elements of the proposed membership weights t * i will concentrate on the ensemble G * i by Dirichlet process with transition parameter α (t) . More precisely, in Dirichlet process the concentration ensemble G * i is determined first, next the weight t G * i i is increased by α (t) and the weights of other ensembles remain unchanged, then finally all elements of the proposed weight are normalized so as to satisfy the condition of probability. In the case of hard clustering as in BIM, such normalization is not necessary because of one-hot representation of the weight.
The concentration ensemble G * i is determined according to the transition probability in Dirichlet process as in the following Eq (16). The probability for the concentration ensemble G * i is proportional to the size of the ensemble G * i , that is, the sum of the proposed weights of the neurons belonging to the ensemble G * i [15,16].
Then, the proposed weights are computed as in Eq (17), which includes the increase of the weight t G*i and the normalization.
Conversely, to satisfy the detailed balance condition in MCMC, the reverse process for the determination of concentration ensemble must be considered. The probability of concentration ensemble before update, denoted by G 0 i , is expressed by Eq (18) under given membership weight after update, namely t * i .
where backslash means removal of specific element, t 0 Here we should comment on the difference in the inference of membership weight t between BIM and our model for later convenience. In BIM, the membership weight t is a categorical variable, therefore the concentration ensemble G * i is not necessary in determining the transition destination, and the transition probability to each ensemble is defined in proportion to the size of ensemble. Since there may be a transition to a new ensemble according to Dirichlet process, the cases of transitions to an existing ensemble and to new ensemble should be described separately. Hence, in BIM, the transition probabilities under the weight t in one-hot representation are given as follows.
where α new represents concentration parameter in Dirichlet process, and larger value of α new results in larger A. In our model, α new is set to 0 because there is no upper bound for A due to soft clustering.
We go back to the description of our model. The ratio of probabilities for the concentration ensembles G * i ; G * 0 is given by The derived probabilities ratio is represented by the ratio of the ensemble sizes. Remember that t * i is the proposed membership weight, and for acceptance of the proposed weight it is necessary to calculate the acceptance rate in Metropolis-Hastings method. The acceptance rate of the proposed weight t * i under the original weight t 0 i is given as where hyperparameters in P are omitted and For the determination of the transition destination of ω, it is not chosen probabilistically but always an inverted value of the current activity state, because ω is a binary variable taking 0 or 1. Furthermore, since the value of ω after the transition is also binary, no special change is required after transition. In contrast, ω takes continuous value in our method, therefore it is necessary to change a decision rule for the transition destination. In addition, for normalization, it is necessary to consider how to change the value of ω after determining the transition destination.
Based on the fact that beta distribution is a special case of Dirichlet distribution, namely Dirichlet distribution of order two, we design the decision rule for the transition destination of ω and its value after the transition by considering the difference of update rule on t between BIM and ours. Here we give the constraint that the number of dimensions does not increase, as for t in BIM in Eq (19). As a result, the transition destination is determined in proportion to the sum of the current activity values, with the value of the concentration parameter being set to 0.
For the value of ω after the transition, normalization is applied so that the sum of the activity values is 1. Here the transition parameter α (w) is introduced as in the case of t in Eq (18), and the activity value of transition destination is increased by α (w) .
Finally, acceptance rate is expressed by the formula in Eq (25).
Pðt; ω * km ; ω 0 m;nk ; sÞ Pðt; ω 0 ; sÞ Once the variables are updated, the hyperparameters are also updated using updated ω and t. The hyperparameters to be updated are a ðnÞ m , which is related to membership of neuron, where the symbols with tilde are defined in Eqs (6)- (12) and the ones with hat are updated hyperparameters. Finally, we mention the computational cost of our proposed method. The computational cost of our method is larger than the original BIM by a factor of the number of clusters. However, in our analysis, the number of clusters is supposed to be relatively small. In addition, the computational cost of the original BIM is not so large. Therefore, the computational cost of our method is not a problem in practice.

The model of fluorescence intensity
We conduct numerical analysis for the effectiveness and the validity of our proposed method by applying to continuous-valued data. The synthetic activity data for validation of the proposed method is generated under consideration of experimental fluorescence intensity by calcium imaging. In calcium imaging, fluorescent protein binds to calcium ions and emits light, and its activity is observed as light intensity. There is a chemical nonlinear relation between neuronal activity and fluorescence intensity. This relation depends on calcium ion concentration and the type of fluorescent protein, and nonlinearity is controlled by the parameters in general.
In our study, leaky integrated-and-fire model is used to generate the synthetic data. This mathematical model is for representing time series data of electrical intracellular membrane potentials. In this model, the relation among inflow currents from other neurons via synaptic and gap junctions, the external stimulus currents, and the time variation of membrane potential is expressed as differential equations with additional noise term, where the definitions of variables and parameters are given in Table 3.
The input currents via chemical synaptic junction and gap junction are determined by the chemical synaptic junction weight w ðchemÞ ij and the gap junction weight w ðgapÞ ij , respectively. The current via chemical synaptic junction is given by the logistic function of membrane potential in Eq (34). For the current via the gap junction, the weight w ðgapÞ ij serves as a constant resistance in Eq (35). To summarize, these currents are expressed as where V (half) and V (width) represent the position of half value and half width of the logistic function, respectively. For chemical synaptic connection, the sign of w ðchemÞ ij is positive for excitatory connection and negative for inhibitory, respectively. In numerical analysis, these equations are implemented by Euler-Maruyama method to generate time series signals of membrane potentials. Next, the model for converting membrane potential to fluorescence intensity is described. The time derivative of fluorescence intensity in the calcium imaging is given by the following differential equation, where t ðFÞ i is the time constant of the fluorescent protein, ½Ca 2þ � it is the intracellular calcium ion concentration of the ith neuron at time τ, and F iτ is the fluorescence intensity of the ith neuron at time τ. The nonlinear function for calcium ion concentration in Eq (36) is expressed Table 3. Variables and parameters in leaky integrated-and-fire model. by sigmoid-like function, which is called Hill's equation as where F (max) is the maximum of fluorescence intensity, K (D) and h are parameters for controlling the shape of sigmoid-like function. The chemical response properties of fluorescent proteins to changes in calcium concentration are described in Ref [17]. We used the function in Eq (37) by referring it.
In generating the data of fluorescence intensity, the relation between calcium ion concentration and membrane potential is simplified in our work. Namely, we assume that calcium ion concentration is directly proportional to membrane potential as ½Ca 2þ � it / V it (we set the constant of proportionality unity in the following), which leads to the differential equation for fluorescence intensity in the following form, In applying our proposed method to synthetic fluorescence intensity practically, the fluorescence intensity data is rescaled toF it as follows, whose value is in the range [0, 1].

Generation of synthetic data
In our numerical analysis, we generate synthetic data with N = 100, M = 2000, A = 8, and with "structural" ensembles E 1 * E 8 defined by the geometry of structural connections. Two structural ensembles E 1 and E 5 have 20 neurons and others have 10 neurons. The values of parameters in leaky integrated-and-fire model are common to all neurons as in Table 4. For fluorescent protein, we set parameters for chemical reaction by considering specific fluorescent protein. Neurons in the structural ensembles E 1 * E 4 respond faster than Table 4. Values of parameters in leaky integrated-and-fire model: The unit of each quantity can be chosen auxiliary in this numerical analysis. In practical application, the unit can be chosen appropriately to match the experimental condition.  is smaller. Between ensembles, structural connections are given probabilistically as synaptic connection and gap junction. Excitatory synaptic connections are given in the ensemble pairs, E 1 ! E 2 and E 6 ! E 5 . Similarly, inhibitory synaptic connections are given in the pairs E 1 ! E 3 and E 7 ! E 5 . There is no synaptic connection between ensembles except for the above-mentioned pairs. In addition, excitatory synaptic connections and symmetric gap junctions are given probabilistically between neurons in the same ensemble. The connection probabilities are given in Table 5. As a result, we obtain the connection matrices of chemical synaptic connection and gap junction between neurons as shown in Fig 1. In out setting, the reason for introducing disconnected structural ensembles E 4 and E 8 is as follows. As mentioned in the introduction, our method estimates functional neuronal ensembles only from synchronization of activity data. Therefore, it cannot directly find the possible background structural connection between neurons. Even though the result of estimation for  functional ensembles reflects the geometry of structural ensembles, functional and structural ensembles are not always the same. To demonstrate the difference between functional and structural ensembles, we intentionally use the same input currents to obtain similar neuronal activities for prepared structural ensembles, where some ensembles have inter-ensemble connections and others do not.
In addition, due to the difference between functional and structural ensembles as mentioned above, the exact structural ensembles cannot be found by our method. Hence, it is not necessary to introduce detailed structure of neuronal network geometry in the numerical analysis. For investigating the basic properties of the proposed method, we use simple neuronal network geometry with 8 structural ensembles in our numerical analysis.
For investigating stationary and non-stationary activities, we generate activity data with constant and time-varying input currents. Specifically, we generate three time series data, which are shown in Fig 2: the data under the same and stationary input current to all neurons as in 1 in Fig 3,  We should compare the result of our numerical analysis with BIM, however application of the inference method in BIM is limited to binary data. Therefore, when applying two methods to synthetic continuous-valued data, raw continuous data is used in our proposed method and the binarized one is used in BIM. Binarization is performed using Peakfind function in MATLAB, where local maximum of time-series signal is regarded as spike. More precisely, Peakfind function can find local maximum value of a sample point in a data series, which is  Tables 4 and 5. In contrast, B1, B2, and B3 are the data after binarization using Peakfind function, which are for the numerical analysis of BIM. In B1, B2, and B3, the activities in the same structural ensemble are also similar as in A1, A2, and A3. Therefore we consider there is no problem with data preprocessing by binarization. Furthermore, it can be observed that the  activities of neurons in the downstream structural ensembles, namely E 2 , E 3 , E 5 in our setting, tend to be disturbed by the neurons in the upstream ensembles, E 1 , E 6 , E 7 . More precisely, synchronization of neuronal activities in the downstream ensemble tends to be disrupted due to the activity of neurons in the upstream ensemble, as can be seen in A1, A2, and A3 in Fig 4.

Results of the application and discussion
As mentioned before, the purpose of our method is to infer functional neuronal ensembles, and it does not matter whether the structural neuronal ensembles defined by structural connections can be estimated or not. In other words, there is no ground-truth solution. Therefore, we attempt to capture neuronal ensembles with similar activities under various "granularities", namely scales of ensembles. We emphasize that our soft clustering approach allows ensemble inference at different granularities, which is not possible in BIM due to hard clustering. Since our goal is to estimate the ensembles at different granularities, we cannot judge the superiority or inferiority of ensemble inference performance using a metric such as the distance from true structural connection. Hence, we display the results visually and discuss them. Additionally, stationary and non-stationary data are used to show that even non-stationary data can be handled by our soft clustering method.
We apply two methods to synthetic data: our proposed method and BIM. For the performance of functional ensemble inference, the average of multiple trials with different initial conditions is taken, because the dependence of MCMC result on initial condition should be removed. In BIM, the number of initial ensembles is set to half the number of neurons as recommended in [10], which is to avoid convergence to an inappropriate local solution. Other conditions are the same in two methods. The detailed conditions of the numerical analysis are summarized in Table 6.
For the behavior by multiple trials, we observe how often each neuron is classified into the same functional ensemble, for which we define the similarity matrix U. The element of U represents the similarity of activities between neurons, which has the information of functional ensembles averaged over multiple trials. Element of the similarity matrix is expressed as where superscript r is the label of trial number with initial condition being changed.
By comparing A3, B3, C3, and D3 in Fig 5 under different and non-stationary input current as in 3 in Fig 3, the resulting functional ensembles are changed by the value of the transition parameter α (t) . For example, when focusing on neurons i = 1 * 30 in the structural ensembles E 1 and E 2 , the coarse-grained large functional ensemble is obtained under small α (t) as in A3 in Fig 5, where the original two structural ensembles E 1 and E 2 are merged into one functional ensemble. In contrast, smaller functional ensembles are obtained under large α (t) as in D3 in Fig 5, where the structural ensembles E 1 and E 2 are separated. Furthermore, in our method, we observe that the neurons in different structural ensembles, for example the neurons in the In our method, the inference of functional ensemble is based only on synchronization of activity. Thus, even if there is no structural connection, these neurons with similar activity are classified into the same functional ensemble as a consequence. In contrast, when the method in BIM is applied to the activity data after binarization, only the course-grained functional ensembles can be found as in A3 in Fig 6. It is difficult to separate the original structural ensembles with similar activity by this method, because this method can only be used as hard clustering and does not have an extra parameter to control hard/soft clustering like the transition parameter in our proposed method.
We also observe the dependence on the input current. In our proposed method, when the activities in all neurons are similar as in A1 in Fig 4,   grained functional ensembles. The differences among ensembles in the same group can be found in more detail as we switch toward hard clustering (or large α (t) ). This behavior can be observed in A1, B1, C1, and D1 in Fig 5. In contrast, in the case of A2 in Fig 5, difference in information flow between ensembles due to non-stationary input current leads to clear difference in activity. In this case, a large functional ensemble is confirmed as the merged structural ensemble groups {E 3 , E 4 , E 6 , E 7 , E 8 }, which have similar activities. As we switch closer to hard clustering, the hierarchy can be found: the smaller functional ensembles of {E 3 , E 6 } and {E 4 , E 7 , E 8 } (see D2 in Fig 5). On the other hand, the structural ensembles in the groups of {E 3 , E 6 } or {E 4 , E 7 , E 8 } cannot be separated anymore, where the activities of neurons highly synchronize in each group.
While BIM can estimate course-grained functional ensemble, it cannot find the hierarchy of ensembles confirmed by our method from the results in A1, A2, and A3 in Fig 6. Another feature is that the presence or absence of stationarity has little influence on the results by BIM. This may be due to hard clustering or ensemble inference by time-averaged activity, because hard clustering or time-averaged activity cannot appropriately capture the temporal change of the feature in the activity.
Both of BIM and our proposed method determine functional ensembles based on synchronization of activity. Therefore, the activation timings must coincide among neurons to be regarded as synchronous activity in both models. Nevertheless, our method has the advantage for functional ensemble inference. There are upstream and downstream information transmissions through connections. The downstream neurons are activated later than the upstream ones, and there is always a time gap even in the activity of synchronous neurons. In binary data, the perfect coincidence of activation timing (namely spike timing) is required for identifying synchronization. Hence, it is difficult to find synchronization due to the presence of time gap. In contrast, in continuous-valued data, the activity data before and after the maximum activity value (or spike) can also be used to identify synchronization. Therefore, the model of continuous activity can identify synchronization more appropriately than the binary model. We guess that this is the reason that the functional ensemble in Fig 6 by BIM becomes always similar regardless of the feature of external input current.
Finally, for comparison between the results by our proposed method and BIM, dendrograms for hierarchical structure of functional ensembles are depicted in Fig 7, where the data under non-stationary input current (3 in Fig 3) is used. For dendrograms, similarity matrices U in Figs 5 and 6 are used for computing distance between neurons. In addition, we also define similarity matrixŨ for input signal, namely neuronal activity s as  tuning transition parameter α (t) . Obviously, the structures of dendrograms A, B, C, and D are similar to X. This means that the hierarchical structure of function ensembles by our proposed method reflects the hierarchical structure of input activity data correctly, which validates the result by our proposed method. For small α (t) or soft clustering case, the distance between neurons tends to be small. This implies that membership weight t of individual neuron spreads over many ensembles, and the number of neurons in one ensemble tends to be large. In contrast, for large α (t) or hard clustering case, distance between neurons tends to be large. In this case, the membership weight t concentrates on a specific ensemble, and the number of neurons in one ensemble tends to be small. Such difference between membership weights leads to the difference between dendrograms. Namely, hierarchical structure of dendrogram can be controlled by transition parameter α (t) . We also depict the dendrogram Y by the result of BIM, where the hierarchical structure is not similar to X. In addition, the result of BIM cannot be controlled by tuning parameter like α (t) . This implies that BIM may not be able to infer functional ensembles appropriately in some cases. We think that this fact also supports the advantage of our proposed method.

Conclusion
In this study, we extended the inference model for functional neuronal ensemble to be applicable regardless of data format and stationarity. The purpose of our proposed method is to classify neurons into functional ensembles in large-scale activity data acquired by experimental method such as calcium imaging. For this reason, no restriction on the format of the activity or no assumption of stationarity due to physiological experimental conditions is desirable. Therefore, our proposed method without restriction on format or stationarity assumption will work effectively and can be widely used as a method of data preprocessing. By applying our proposed method to identical synthetic data multiple times, we confirmed that it converges to a reasonable and sufficiently stable solution of functional ensembles, although our method has dependence on initial condition in MCMC. In addition, by comparing our method with BIM, we believe that our method is more useful to obtain functional ensembles and their mutual relation by adjusting the transition parameter α (t) .
The inference for functional neuronal ensembles in our study or BIM can be considered as preprocessing for clarifying geometry of functional neuronal network, whose target size is recently increasing [10]. Therefore, functional network inference with the aid of information of functional ensemble is a future topic of our study. As existing methods of neuronal network inference, the method using spin-glass model, which is to describe the ordered states of magnetic materials with impurities in the field of statistical physics [3][4][5], and the method of graph analysis for graphical representation of similarity between neurons [18][19][20] are known for example. However, stationarity of network structure is assumed in many network inference methods. In addition, input neuronal activity is often limited to binary in these methods, and they cannot be applied to continuous-valued data such as fluorescence intensity. Therefore, they are not sufficient as models to express functional connections between neurons with nonstationarity. For the use of our proposed method as preprocessing of network inference, the first issue to be considered is to generalize the network inference method without restriction of data format or stationarity assumption of input activity.  Fig 2. The row represents the label of neuron, the column represents the label of time step, and the value in each cell represents the magnitude of normalized fluorescence intensity. We used this data for the analysis of our proposed method. (CSV) S11 Data. Original data in B2 in Fig 4. This csv file represents binarized data of S10, as shown in B2 in Fig 4. The row represents the label of neuron, the column represents the label of time step, and the value of each cell represents the inactive state for 0 and the active state for 1. (CSV) S12 Data. Original data in A3 in Fig 2. This csv file represents membrane potential generated by leaky integrated-and-fire model, as shown in A3 in Fig 2. Here all neurons are given the same and stationary input current. The row represents the label of neuron, the column represents the label of time step, and the value in each cell represents the magnitude of membrane potential. (CSV) S13 Data. Original data in B3 in Fig 2. This csv file represents fluorescence intensity generated from S12 data, as shown in B3 in Fig 2. The row represents the label of neuron, the column represents the label of time step, and the value in each cell represents the magnitude of fluorescence intensity. (CSV) S14 Data. Original data in C3 in Fig 2. This csv file represents fluorescence intensity after normalization of S13 to the range [0, 1], as shown in C3 in Fig 2. The row represents the label of neuron, the column represents the label of time step, and the value in each cell represents the magnitude of normalized fluorescence intensity. We used this data for the analysis of our proposed method. (CSV) S15 Data. Original data in B3 in Fig 4. This csv file represents binarized data of S14, as shown in B3 in Fig 4. The row represents the label of neuron, the column represents the label of time step, and the value of each cell represents the inactive state for 0 and the active state for 1. (CSV) S1 File. The program for synthetic activity data. This program generates synthetic neuronal activity data used in this numerical analysis (for C/C++), which outputs potential (Result_V. csv) using the parameters specified in the program, fluorescence intensity (Result_F.csv) based on membrane potential, synaptic connection weight between neurons (Result_WChem.csv), and gap junction weight (Result_WGap.csv). (CPP)